This notebook is from the PO.DAAC Cookbook; to access the GitHub version of the notebook, follow this link.

Working with SWOT Level 2 Water Mask Raster Image Data Product:

Local Machine Download Version

Author: Nicholas Tarpinian, PO.DAAC

Summary & Learning Objectives

This notebook showcases how to work with multiple files from the SWOT Raster Image data product (version 2.0) on a local machine.

  • Utilizing the earthaccess Python package. For more information, visit: https://nsidc.github.io/earthaccess/
  • Querying the dataset at the user's choice of resolution, either '100m' or '250m'.
  • Visualizing multiple raster images on a single map.
  • Stacking multiple raster images and creating a time dimension to analyze change over time.
  • Adjusting images based on the quality flag.
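The resolution choice maps to a wildcard filter on the granule file name, as used in the `granule_name` argument of the search calls later in this notebook. A minimal sketch of how such a wildcard behaves (the 250 m name below is a hypothetical counterpart of a real 100 m granule downloaded later):

```python
from fnmatch import fnmatch

# One real 100 m granule name from this notebook, plus a hypothetical
# 250 m counterpart for illustration
granules = [
    "SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_045F_20240201T183814_20240201T183835_PIC0_01.nc",
    "SWOT_L2_HR_Raster_250m_UTM11S_N_x_x_x_010_218_045F_20240201T183814_20240201T183835_PIC0_01.nc",
]

# granule_name='*_100m_*' in earthaccess.search_data applies this kind of wildcard
selected = [g for g in granules if fnmatch(g, "*_100m_*")]
print(selected)
```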

Requirements

1. Compute environment

This tutorial is written to run in the following environment:

  • Local compute environment, e.g. a laptop or server: this tutorial can be run on your local machine.

2. Earthdata Login

An Earthdata Login account is required to access data, as well as to discover restricted data, in the NASA Earthdata system. Please visit https://urs.earthdata.nasa.gov to register and manage your Earthdata Login account. The account is free to create and only takes a moment to set up.

Import libraries

import os
from os.path import isfile, basename, abspath
import xarray as xr
import numpy as np
from datetime import datetime
from pathlib import Path
import hvplot
import hvplot.xarray 
import earthaccess

Authentication with earthaccess

In this notebook, we authenticate with Earthdata Login by calling earthaccess.login() in the cell below.

auth = earthaccess.login()

Search for SWOT Raster products using earthaccess

Each dataset has its own unique collection concept ID. For the SWOT_L2_HR_Raster_2.0 dataset, we can find the collection ID here.

For this tutorial, we are looking at the Lake Mead Reservoir in the United States.

We used bbox finder to get the exact coordinates for our area of interest.
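earthaccess expects bounding_box as (lower-left longitude, lower-left latitude, upper-right longitude, upper-right latitude) in decimal degrees. A quick sanity check on the Lake Mead box used below (a sketch):

```python
# (west, south, east, north) in decimal degrees
west, south, east, north = -115.112686, 35.740939, -114.224167, 36.937819

# A well-formed box has west < east and south < north
assert west < east and south < north
print(f"box spans {east - west:.3f} deg lon x {north - south:.3f} deg lat")
```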

results = earthaccess.search_data(
    short_name='SWOT_L2_HR_RASTER_2.0',
    bounding_box=(-115.112686, 35.740939, -114.224167, 36.937819),
    temporal=('2024-02-01 12:00:00', '2024-02-01 23:59:59'),
    granule_name='*_100m_*',
    count=200
)
Granules found: 2
earthaccess.download(results, "data_downloads/SWOT_Raster_LakeMead/")
 Getting 2 granules, approx download size: 0.07 GB
Accessing cloud dataset using dataset endpoint credentials: https://archive.swot.podaac.earthdata.nasa.gov/s3credentials
Downloaded: data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_045F_20240201T183814_20240201T183835_PIC0_01.nc
Downloaded: data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_046F_20240201T183834_20240201T183855_PIC0_01.nc
['data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_045F_20240201T183814_20240201T183835_PIC0_01.nc',
 'data_downloads/SWOT_Raster_LakeMead/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_046F_20240201T183834_20240201T183855_PIC0_01.nc']

Visualizing Multiple Tiles

Let’s now visualize multiple raster tiles in a folder and explore the data.

We use xarray.open_mfdataset, which supports opening multiple files as a single dataset.

folder_path = "data_downloads/SWOT_Raster_LakeMead/"

file_paths = [os.path.join(folder_path, file) for file in os.listdir(folder_path) if file.endswith('.nc')]
ds = xr.open_mfdataset(file_paths, combine='nested', concat_dim='x')
ds
<xarray.Dataset>
Dimensions:                  (y: 2784, x: 3074)
Coordinates:
  * y                        (y) float64 3.899e+06 3.899e+06 ... 4.177e+06
  * x                        (x) float64 5.682e+05 5.683e+05 ... 6.928e+05
Data variables: (12/39)
    crs                      (x) object b'1' b'1' b'1' b'1' ... b'1' b'1' b'1'
    longitude                (y, x) float64 dask.array<chunksize=(512, 512), meta=np.ndarray>
    latitude                 (y, x) float64 dask.array<chunksize=(512, 512), meta=np.ndarray>
    wse                      (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    wse_qual                 (y, x) float32 dask.array<chunksize=(2784, 1536), meta=np.ndarray>
    wse_qual_bitwise         (y, x) float64 dask.array<chunksize=(768, 768), meta=np.ndarray>
    ...                       ...
    load_tide_fes            (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    load_tide_got            (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    pole_tide                (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    model_dry_tropo_cor      (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    model_wet_tropo_cor      (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
    iono_cor_gim_ka          (y, x) float32 dask.array<chunksize=(768, 768), meta=np.ndarray>
Attributes: (12/49)
    Conventions:                   CF-1.7
    title:                         Level 2 KaRIn High Rate Raster Data Product
    source:                        Ka-band radar interferometer
    history:                       2024-02-05T08:50:33Z : Creation
    platform:                      SWOT
    references:                    V1.2.1
    ...                            ...
    x_min:                         568200.0
    x_max:                         721700.0
    y_min:                         3898600.0
    y_max:                         4052100.0
    institution:                   CNES
    product_version:               01
raster_plot = ds.wse.hvplot.quadmesh(x='x', y='y', rasterize=True, title='SWOT Raster 100m: Lake Mead Reservoir')
raster_plot.opts(width=700, height=600, colorbar=True)

Creating a Time Series

The SWOT Raster product does not include a time dimension; each file is a snapshot in time. We can add one by extracting the timestamp from each file name:

1. Expand the time range of your earthaccess search to get an adequate range.
2. Extract the datetime from each file name, then concatenate the datasets along the new time dimension.
3. Save the output as a new NetCDF file.
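Step 2 can be sketched in isolation: in SWOT Raster granule names, the fourth-from-last underscore-delimited field is the granule start time (shown here on a real file name from the earlier download):

```python
from datetime import datetime
from os.path import basename

# A granule name from the download above
name = "SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_218_045F_20240201T183814_20240201T183835_PIC0_01.nc"

# The fourth-from-last underscore-delimited field is the granule start time
stamp = basename(name).split("_")[-4][:15]
start_time = datetime.strptime(stamp, "%Y%m%dT%H%M%S")
print(start_time)  # 2024-02-01 18:38:14
```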
time_results = earthaccess.search_data(
    short_name='SWOT_L2_HR_RASTER_2.0',
    bounding_box=(-114.502048, 36.060175, -114.390983, 36.210182),
    temporal=('2024-01-25 00:00:00', '2024-03-04 23:59:59'),
    granule_name='*_100m_*',
    count=200
)
Granules found: 3
earthaccess.download(time_results, "data_downloads/SWOT_Raster_LakeMead_Time/")
 Getting 3 granules, approx download size: 0.12 GB
Accessing cloud dataset using dataset endpoint credentials: https://archive.swot.podaac.earthdata.nasa.gov/s3credentials
Downloaded: data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_205_109F_20240201T075048_20240201T075109_PIC0_01.nc
Downloaded: data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_496_046F_20240211T170050_20240211T170111_PIC0_01.nc
Downloaded: data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_011_205_109F_20240222T043554_20240222T043615_PIC0_01.nc
['data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_205_109F_20240201T075048_20240201T075109_PIC0_01.nc',
 'data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_010_496_046F_20240211T170050_20240211T170111_PIC0_01.nc',
 'data_downloads/SWOT_Raster_LakeMead_Time/SWOT_L2_HR_Raster_100m_UTM11S_N_x_x_x_011_205_109F_20240222T043554_20240222T043615_PIC0_01.nc']
folder_path = "data_downloads/SWOT_Raster_LakeMead_Time/"
file_paths = [os.path.join(folder_path, file) for file in os.listdir(folder_path) if file.endswith('.nc')]
def add_time_dimension(ds):
    # Extract the date/time string from the file name
    file_date = os.path.basename(ds.encoding['source']).split("_")[-4][:15]
    # Convert the date string to a datetime object
    time_value = datetime.strptime(file_date, "%Y%m%dT%H%M%S")
    # Assign the time coordinate to the dataset
    ds.coords['time'] = time_value
    return ds

datasets = []
file_names = []

for file_path in file_paths:
    dataset = xr.open_dataset(file_path)
    datasets.append(add_time_dimension(dataset))
    file_names.append(os.path.basename(file_path))
    dataset.close()
# Sort the datasets so the time dimension is in chronological order
datasets.sort(key=lambda ds: ds.time.values)
ds2 = xr.concat(datasets, dim='time')
ds2
<xarray.Dataset>
Dimensions:                  (x: 1549, y: 1549, time: 3)
Coordinates:
  * x                        (x) float64 6.788e+05 6.789e+05 ... 8.336e+05
  * y                        (y) float64 3.9e+06 3.901e+06 ... 4.055e+06
  * time                     (time) datetime64[ns] 2024-02-01T07:50:48 ... 20...
Data variables: (12/39)
    crs                      (time) object b'1' b'1' b'1'
    longitude                (time, y, x) float64 nan nan nan ... nan nan nan
    latitude                 (time, y, x) float64 nan nan nan ... nan nan nan
    wse                      (time, y, x) float32 nan nan nan ... nan nan nan
    wse_qual                 (time, y, x) float32 nan nan nan ... nan nan nan
    wse_qual_bitwise         (time, y, x) float64 nan nan nan ... nan nan nan
    ...                       ...
    load_tide_fes            (time, y, x) float32 nan nan nan ... nan nan nan
    load_tide_got            (time, y, x) float32 nan nan nan ... nan nan nan
    pole_tide                (time, y, x) float32 nan nan nan ... nan nan nan
    model_dry_tropo_cor      (time, y, x) float32 nan nan nan ... nan nan nan
    model_wet_tropo_cor     (time, y, x) float32 nan nan nan ... nan nan nan
    iono_cor_gim_ka          (time, y, x) float32 nan nan nan ... nan nan nan
Attributes: (12/49)
    Conventions:                   CF-1.7
    title:                         Level 2 KaRIn High Rate Raster Data Product
    source:                        Ka-band radar interferometer
    history:                       2024-02-05T12:55:01Z : Creation
    platform:                      SWOT
    references:                    V1.2.1
    ...                            ...
    x_min:                         680100.0
    x_max:                         829300.0
    y_min:                         3903300.0
    y_max:                         4052400.0
    institution:                   CNES
    product_version:               01
timeplot = ds2.wse.hvplot.image(y='y', x='x')
timeplot.opts(width=700, height=500, colorbar=True)

Let's plot the wse quality flag, wse_qual, which ranges from 0 to 3, where 0 = good, 1 = suspect, 2 = degraded, and 3 = bad (as described in the variable's attributes when printed with xarray).
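Before masking, it can help to tally how many pixels carry each flag value. A minimal sketch on a toy array (on real data, replace qual with ds2['wse_qual'].values):

```python
import numpy as np

# Toy stand-in for a wse_qual slice: 0=good, 1=suspect, 2=degraded, 3=bad, NaN=no data
qual = np.array([[0, 0, 1, np.nan],
                 [2, 0, 3, 1]])

# Count occurrences of each flag value, ignoring NaN fill
values, counts = np.unique(qual[~np.isnan(qual)], return_counts=True)
for v, c in zip(values.astype(int), counts):
    print(f"flag {v}: {c} pixels")
```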

timeplot = ds2.wse_qual.hvplot.image(y='y', x='x')
timeplot.opts(width=700, height=500, colorbar=True)

Masking a variable with its quality flag

variable_to_mask = ds2['wse']
mask_variable = ds2['wse_qual']
# Define the condition for masking based on the range of the quality flag
mask_condition = mask_variable < 2

masked_variable = variable_to_mask.where(mask_condition)
# Update the masked variable in the dataset
ds2['wse'] = masked_variable

ds2['wse'].hvplot.image(y='y', x='x').opts(width=700, height=500, colorbar=True)

Our end product is a time series showing only the values where the quality flag is either good (0) or suspect (1).
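The same keep-if-flag-below-2 logic can be illustrated with plain numpy (a sketch on toy values; xarray's .where used above behaves analogously, filling rejected cells with NaN):

```python
import numpy as np

wse = np.array([100.5, 101.2, 99.8, 102.0])   # toy water surface elevations
qual = np.array([0, 1, 2, 3])                 # matching quality flags

# Keep only good (0) and suspect (1) pixels; others become NaN
masked = np.where(qual < 2, wse, np.nan)
print(masked)  # [100.5 101.2   nan   nan]
```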

We can also save the time series, with the quality mask applied, to a new NetCDF file.

output_folder = 'data_downloads/SWOT_Raster_Time/'
os.makedirs(output_folder, exist_ok=True)
output_netcdf_path = os.path.join(output_folder, "Output_Time.nc")
ds2.to_netcdf(output_netcdf_path)

print(f"Output complete: {output_netcdf_path}")
Output complete: data_downloads/SWOT_Raster_Time/Output_Time.nc

Appendix: Alternate Plot

# # Alternate plotting with matplotlib
# %matplotlib inline

# import matplotlib.pyplot as plt
# from matplotlib import animation
# from matplotlib.animation import FuncAnimation, PillowWriter
# from IPython.display import display, Image, HTML

# variable_name = 'wse'
# data = ds2[variable_name]

# fig, ax = plt.subplots(figsize=(10, 8))
# fig.set_tight_layout({'rect': [0.01, 0.01, 1.0, 1.0]})

# contour = ax.contourf(data.isel(time=0), cmap='viridis')
# cbar = plt.colorbar(contour)
# cbar.set_label('Water Surface Elevation (meters)', fontsize=14)
# times = ds2.time.values

# # Function to update the plot for each time step
# def update(frame):
#     ax.clear()
#     contour = ax.contourf(data.isel(time=frame), cmap='viridis')
#     formatted_time = str(times[frame])[:-7]
#     ax.set_title(f'Date: {formatted_time}')
#     ax.set_xlabel('Longitude', fontsize=14)
#     ax.set_ylabel('Latitude', fontsize=14)
#     ax.text(0.5, 1.05, 'SWOT Raster 100m: Lake Mead Reservoir', transform=ax.transAxes, ha='center', fontsize=14)
#     return contour,

# # Creating a gif animation
# ani = animation.FuncAnimation(fig, update, repeat=True, frames=len(data['time']), blit=True, interval=3000)

# output = './time_series.gif'
# ani.save(output, writer='pillow', fps=0.5)

# with open(output, 'rb') as f:
#     display(Image(data=f.read(), format='gif'))

# plt.close(fig)
# ds2.close()